Abstract Body


Background / Context: 

Research on gender achievement gaps shows that they exist and are largest in the tails of the
distribution, starting as early as Kindergarten and persisting through eighth grade.

In mathematics, studies find small average gender achievement gaps and larger 
systematically male-favoring gaps among the highest achieving students. Using the Early 
Childhood Longitudinal Study-Kindergarten (ECLS-K) 1998-99 cohort data, multiple studies 
find that an average mathematics gap favoring boys emerges by the end of 1st grade, grows to 
approximately 0.20 SD by the end of 3rd grade, and persists through 5th grade (Lee, Moon and 
Hegar, 2011; Husain and Millimet, 2009; Fryer and Levitt, 2009; Robinson and Lubienski, 2011; 
Sohn, 2012). Additionally, work using ECLS-K finds small, but significant gaps exist at the 
upper tail of the mathematics score distribution starting as early as Kindergarten, and that these 
gaps grow and extend to the rest of the distribution by 3rd grade (Husain & Millimet, 2009; 
Robinson & Lubienski, 2011; Fryer & Levitt, 2009). 

In contrast, in English language arts (ELA), studies using the ECLS-K data show that a
substantial average gap favoring females exists as early as the fall of Kindergarten and that this
gap remains fairly static through fifth grade (Husain and Millimet, 2009; Robinson and Lubienski,
2011; Fryer and Levitt, 2009; Chatterji, 2006). As in mathematics, prior work also finds that it is
critical to examine the tails of the distribution. The gap among the lowest achieving students
appears to be larger than the average gap in all grades (Robinson and Lubienski, 2011).

Prior research shows that the average gaps and the gaps among the highest and lowest
achieving students exist across grades, potentially changing in magnitude and significance as
students progress from Kindergarten through eighth grade. However, it does not explicitly
model these changes in the gaps across grades.

Purpose / Objective / Research Question / Focus of Study: 

This paper seeks to understand underlying patterns in how gender achievement gaps 
grow, shrink or stay constant as students move through elementary school and middle school. 

The focus of this paper is to capture these trends not only for average gender achievement gaps,
but also for gaps in the tails of the achievement distribution, which prior work has shown to be
critical areas of analysis and which may exhibit different patterns than those at the mean.
Understanding when gaps change, and in what direction, provides important information about the
mechanisms that may produce the changes in the gaps and highlights critical time points for
intervention, sparking future lines of research in this field.

Setting: 

This study focuses on gender achievement gaps within the United States, with particular 
attention to how these gaps change as students progress from first through eighth grades. It is a 
longitudinal study of changes in the gaps over grades, terms and cohorts. The gaps are estimated 
at the district level, providing a higher resolution analysis of patterns of change. 

Population / Participants / Subjects: 
The population of study is U.S. school districts serving first through eighth grade students
that use the Northwest Evaluation Association (NWEA) Measures of Academic Progress (MAP)
assessment. The MAP assessment is a low-stakes formative assessment designed to help teachers
better evaluate and assist their students throughout the school year. Approximately 3,700 school
districts across the U.S. use the MAP assessment. The data include fall and spring test scores for
nine grades (Kindergarten through 8th grade) over nine school years (2005 through 2013), with
test records for approximately 7 million students, 15,000 schools, and 3,700 districts from all
U.S. states. Kindergarten is not used in this study because the test is administered far less
widely in that grade across districts.

This sample is not representative of all U.S. elementary and middle school districts;
however, because of the large number of participating districts, it is a useful starting point for
understanding changes in the gaps across grades, particularly in states where a majority of
districts participate. Additionally, these data offer the unique advantage of having both fall and
spring test scores for a subset of the districts. This enables an investigation of how gaps change
during the summer compared with the school year, a question that has not been a focus of gender
achievement gap research to date.
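
As an illustration of what paired fall and spring scores make possible, the sketch below splits one
cohort's gap trajectory into school-year changes (fall to spring within a grade) and summer changes
(spring of one grade to fall of the next). The function name `seasonal_changes`, the dictionary
layout, and the gap values are assumptions made for illustration only; they are not quantities from
this study.

```python
def seasonal_changes(gaps):
    """Split gap changes into school-year and summer components.

    gaps: dict mapping (grade, term) -> gap estimate, where term is
    "fall" or "spring". This layout is a hypothetical convenience.
    """
    school_year, summer = {}, {}
    for (grade, term), gap in gaps.items():
        # School-year change: spring gap minus fall gap in the same grade.
        if term == "fall" and (grade, "spring") in gaps:
            school_year[grade] = gaps[(grade, "spring")] - gap
        # Summer change: next grade's fall gap minus this grade's spring gap.
        if term == "spring" and (grade + 1, "fall") in gaps:
            summer[grade] = gaps[(grade + 1, "fall")] - gap
    return school_year, summer


# Made-up gap values for a single district and cohort (illustration only).
example_gaps = {
    (1, "fall"): 0.02, (1, "spring"): 0.05,
    (2, "fall"): 0.04, (2, "spring"): 0.08,
}
school_year_change, summer_change = seasonal_changes(example_gaps)
```

Here `summer_change` is keyed by the grade that precedes the summer, so entry 1 describes the summer
between first and second grade.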


Intervention / Program / Practice: 

This study is a secondary data analysis of the MAP assessment data. It focuses on
characterizing how district-level gender achievement gaps change as students progress from
first through eighth grade.


Research Design: 


To enable comparability across grades, this study uses metric-free measures of the mean
and tail gender achievement gaps, specifically the V-statistic (Ho, 2009; Ho & Haertel, 2006; Ho
& Reardon, 2012) and the Proportion-Adjusted Relative Difference (Robinson & Lubienski,
2011). The V-statistic is calculated as $V = \sqrt{2}\,\Phi^{-1}(P_{a<b})$, where $\Phi^{-1}$ is the
inverse cumulative distribution function of the standard normal distribution and $P_{a<b}$ is the
probability that a randomly chosen student from group a scores below a randomly chosen student
from group b. V is interpretable as the gap in common standard deviation units between two groups,
and is invariant to monotone scale transformations. It is further equal to Cohen’s d when the
group distributions are normal (with either equal or unequal variances), yielding an
interpretation as a scale-invariant effect size that is comparable across tests, times or grades
(Ho, 2009; Ho and Haertel, 2006; Ho and Reardon, 2012; Reardon and Ho, 2015).
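
A minimal sketch of this calculation, assuming raw score lists for the two groups are available: it
estimates the probability that a score from group a falls below a score from group b by comparing
all pairs, then applies V = sqrt(2) * Phi^{-1}(P). The function names are hypothetical, counting
ties as one half is an assumption, and the all-pairs loop is only practical for small samples.

```python
from itertools import product
from math import sqrt
from statistics import NormalDist


def prob_a_below_b(a, b):
    """Estimate P(a < b) over all pairs of scores; ties count as 0.5 (assumption)."""
    total = sum(1.0 if x < y else 0.5 if x == y else 0.0 for x, y in product(a, b))
    return total / (len(a) * len(b))


def v_statistic(a, b):
    """V = sqrt(2) * Phi^{-1}(P(a < b)); positive when group b tends to score higher.

    NormalDist.inv_cdf raises StatisticsError when the probability is exactly
    0 or 1, i.e. when the two groups do not overlap at all.
    """
    return sqrt(2) * NormalDist().inv_cdf(prob_a_below_b(a, b))


# Made-up scores for illustration only.
group_a = [1, 2, 3, 4]
group_b = [2, 3, 4, 5]
v = v_statistic(group_a, group_b)

# Only the ordering of scores matters, so a monotone increasing rescaling
# (here, cubing every score) gives the same value.
v_rescaled = v_statistic([x ** 3 for x in group_a], [y ** 3 for y in group_b])
```

Because the estimate depends on the scores only through pairwise comparisons, `v` and `v_rescaled`
are identical, which is the scale invariance described above.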

The Proportion-Adjusted Relative Difference ($\lambda_\theta$) is a percentile-specific proportion,
which provides information about the representation of males compared with females at different
points (or percentiles) of the distribution:

$$
\lambda_\theta =
\begin{cases}
\dfrac{\Phi_m(\theta)}{\Phi_m(\theta) + \Phi_f(\theta)} & \text{if } \theta < 50 \\[2ex]
\dfrac{1 - \Phi_f(\theta)}{2 - [\Phi_m(\theta) + \Phi_f(\theta)]} & \text{if } \theta \geq 50
\end{cases}
$$

where $\Phi_m(\theta)$ and $\Phi_f(\theta)$ are the cumulative distribution functions for males and
females at the percentile $\theta$. For percentiles less than 50 (greater than or equal to 50),
$\lambda_\theta$ is the proportion of students scoring at or below (above) the given percentile who
are male (female). Note that $\lambda_\theta = 0.5$ indicates gender parity, $\lambda_\theta > 0.5$
indicates a female advantage, and $\lambda_\theta < 0.5$ indicates a male advantage, where advantage
is defined as underrepresentation in the lower tail and overrepresentation in the upper tail of the
achievement distribution.
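
The sketch below shows one way this proportion could be computed from raw scores. It rests on
assumptions not stated in the text: the cutoff score for a percentile is taken from the pooled
male and female distribution using a nearest-rank rule, and the cumulative distribution functions
are replaced by empirical shares. The function names are hypothetical.

```python
import math


def ecdf(scores, cutoff):
    """Empirical share of scores at or below the cutoff."""
    return sum(s <= cutoff for s in scores) / len(scores)


def pooled_cutoff(male, female, theta):
    """Score at the theta-th percentile of the pooled scores (nearest-rank; an assumption)."""
    pooled = sorted(list(male) + list(female))
    rank = max(1, math.ceil(theta * len(pooled) / 100))
    return pooled[rank - 1]


def proportion_adjusted_relative_difference(male, female, theta):
    """Below the median: male share of the lower tail. At or above: female share of the upper tail.

    Values above 0.5 indicate a female advantage, values below 0.5 a male advantage.
    For theta >= 50 the denominator is zero if nobody scores above the cutoff.
    """
    cutoff = pooled_cutoff(male, female, theta)
    phi_m, phi_f = ecdf(male, cutoff), ecdf(female, cutoff)
    if theta < 50:
        return phi_m / (phi_m + phi_f)
    return (1 - phi_f) / (2 - (phi_m + phi_f))


# Made-up scores for illustration only: females shifted upward by two points.
male_scores = list(range(1, 11))
female_scores = list(range(3, 13))
lower_tail = proportion_adjusted_relative_difference(male_scores, female_scores, 10)
upper_tail = proportion_adjusted_relative_difference(male_scores, female_scores, 90)
```

With identical male and female score lists, both tails return 0.5, matching the parity case
described above.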

The use of metric-free measures enables the comparison of gender achievement gaps
across grades without requiring vertical scaling or comparable test metrics, assumptions that, if
unmet, would distort the trend analysis.

Data Collection and Analysis: 

This study leverages the longitudinal nature of the NWEA MAP assessment data to track
district-level gaps over grades and terms for multiple cohorts. Three types of district gaps are
estimated for each subject-by-grade-by-term-by-cohort case: (1) a mean gap; (2) a lower-tail gap
($\theta = 10$); and (3) an upper-tail gap ($\theta = 90$), using the V-statistic and the
Proportion-Adjusted Relative Difference described above. To analyze the patterns in these three
metric-free gaps across grades, this study adopts a change-in-gap framework, modeling changes
in the subject-by-grade-by-term-by-cohort gaps using Hierarchical Linear Modeling (HLM) with
multiple functional forms for the grade and term variables. The baseline, linear grade-trend
model is as follows:

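$$
\begin{aligned}
G^{s}_{dgtc} &= \pi_{dgtc} + \epsilon_{dgtc} \\
\pi_{dgtc} &= \beta_{00d} + \beta_{100}\, g^{*} + \beta_{200}\, c^{*} + \beta_{300}\, f + e_{dgtc} \\
\beta_{00d} &= \gamma_{000} + u_{00d}
\end{aligned}
$$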

where $G^{s}_{dgtc}$ is the V or $\lambda_\theta$ gap statistic in subject s for district d, grade
g, term t, and cohort c; $g^{*}$ is the student grade level, grand-mean centered at 5.5; $c^{*}$ is
the year in which the students were in Kindergarten, grand-mean centered at 2004.5; and f is an
indicator variable for whether the test was administered in the fall term. The models are
precision-weighted to account for the fact that the gaps are estimated. Additional specifications
of this model include: (1) a fully non-parametric specification for grade-by-term (i.e., indicators
for each grade-by-term combination) and (2) random coefficients on the linear and non-parametric
grade-term trends.
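
Substituting the lower two equations of the baseline model into the first gives a single combined
equation, which may make the role of each term easier to see. This is a sketch that assumes the
three error terms enter additively, with $\epsilon_{dgtc}$ read as the sampling error of the
estimated gap (the component that precision weighting addresses); that reading is an interpretation
rather than something stated explicitly here.

```latex
G^{s}_{dgtc} = \gamma_{000}
             + \beta_{100}\, g^{*} + \beta_{200}\, c^{*} + \beta_{300}\, f
             + u_{00d} + e_{dgtc} + \epsilon_{dgtc}
```

Under this form, $\gamma_{000}$ is the average spring gap at the centering points of grade and
cohort, $\beta_{100}$ is the average change in the gap per grade, and $u_{00d}$ shifts the
intercept for district d.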

Multiple functional forms (i.e., linear and non-parametric) are investigated to narrow the
hypothesized mechanisms by determining whether gaps consistently increase or decrease over
grades (linear trend) or whether the changes are more sporadic (non-parametric trend). Including
random slopes enables the characterization of the variability across districts in these trends.
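
To make the contrast between the two functional forms concrete, the sketch below builds the
predictor values for one grade-by-term-by-cohort cell under each specification. The centering
constants (5.5 and 2004.5) come from the model description; the function names, the variable names,
and the choice of fall of first grade as the omitted reference cell are assumptions for
illustration.

```python
GRADES = range(1, 9)          # first through eighth grade
TERMS = ("fall", "spring")


def linear_row(grade, cohort, term):
    """Predictors for the linear grade-trend specification."""
    return {
        "grade_c": grade - 5.5,              # centered grade
        "cohort_c": cohort - 2004.5,         # centered Kindergarten year
        "fall": 1 if term == "fall" else 0,  # fall-term indicator
    }


def nonparametric_row(grade, cohort, term, reference=(1, "fall")):
    """Predictors for the non-parametric specification: one indicator per
    grade-by-term cell, omitting a reference cell (hypothetical choice)."""
    row = {"cohort_c": cohort - 2004.5}
    for g in GRADES:
        for t in TERMS:
            if (g, t) != reference:
                row[f"g{g}_{t}"] = 1 if (g, t) == (grade, term) else 0
    return row


# Example cell: third grade, fall term, cohort in Kindergarten in 2006.
linear_example = linear_row(3, 2006, "fall")
nonparametric_example = nonparametric_row(3, 2006, "fall")
```

The linear form forces the gap to move by the same amount each grade, while the indicator form lets
every grade-by-term cell take its own level, which is what allows a rise followed by a decline to
appear.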

Findings / Results: 

This study finds that gender achievement gaps in both mathematics and reading change
meaningfully as students progress through grades (see Figure 3). Specifically, changes in the
gaps are best captured using a fully non-parametric trend on grade-by-term. A linear trend masks
the period of increase followed by a period of decrease in the gaps at different grades. In both
subjects, the average gaps change in favor of males in elementary school (growing the mathematics
gap, shrinking the reading gap) and in favor of females in middle school (shrinking the
mathematics gap, growing the reading gap). These trends also hold for the upper- and lower-tail
gaps, with the exception of the lower tail in reading, where male students continuously fall
behind.

Conclusions: 
This paper adds to the current literature by systematically analyzing and
characterizing the changes in the gaps as students move through elementary and middle school.
In particular, it offers two key advantages: (1) the use of district-level longitudinal data with
multiple cohorts across eight grades lends robustness to the observed results; and (2) the use
of metric-free measures ensures comparability of gap sizes across grades. It further provides
intuition about the potential mechanisms driving these changes, sparking critical lines of future
research. Planned extensions include jointly modeling the two subjects, given the similar trends
uncovered in the separate subject analyses, and further investigating the differences between
summer and school-year gap trends.
Appendix B. Tables and Figures


Figure 1: Dimensions of Gender Achievement Gaps

Notes: Reading gaps are plotted on the x-axis and corresponding mathematics gaps on the y-axis. Positive
(negative) values indicate gaps are male-favoring (female-favoring). Gender equality in the gaps is at the
origin.

Figure 2: Male-Female Mathematics and ELA Achievement Gaps, School Districts 2009-2012

Notes: Reading gaps are plotted on the x-axis and corresponding mathematics gaps on the y-axis. Positive
(negative) values indicate gaps are male-favoring (female-favoring). The model used to estimate average
gaps includes state fixed effects and adds the average state NAEP gap to the Empirical Bayes estimate.

Table 1: Relationship between Proportion Multiple-Choice Items on State Tests and the Size of
Gender Gaps, State-Level

                                              Mathematics                 ELA
                                          Grade 4     Grade 8     Grade 4     Grade 8
Model 1: State-Level NAEP Audit Test
  Proportion Short Response +            -0.135 **   -0.109      -0.223 *    -0.376 **
    Extended Response                    (0.041)     (0.075)     (0.084)     (0.113)
Model 2: District-Level NWEA Audit Test
  Proportion Short Response +            -0.126      -0.151 *    -0.296 **   -0.389 ***
    Extended Response                    (0.122)     (0.068)     (0.098)     (0.101)


Notes: All models are weighted by 1/se^2. Standard errors (in parentheses) are clustered by state.
Model 1 includes data from 2009 Ed Facts and NAEP data from grades 4 and 8. Model 2 includes data
from 2009 Ed Facts and NWEA data sources from grades 4 and 8. The models are restricted to state or
district by grade cells with gap data from both Ed Facts and NAEP/NWEA. Both models also include
the proportion of "other" items (not shown).

Figure 3: Trend in Male-Female Achievement Gaps over Grades

[Figure: Trend in Male-Female Achievement Gaps over Grades; x-axis: Grade]

Notes: Estimated trends are taken from 3-level precision-weighted HLM models with a non-parametric
grade-by-term